Joint speech and audio coding combining sinusoidal modeling and wavelet packets

Authors

  • Márk Fék
  • Annamária R. Várkonyi-Kóczy
  • Jean-Marc Boucher
Abstract

This paper presents a joint speech and audio coding algorithm combining sinusoidal modeling and a perceptually adapted Wavelet Packet Transform (WPT). The input signal is band-limited to 50-7000 Hz and sampled at 16 kHz. The sinusoidal modeling uses a Sinusoidal Similarity Measure (SSM) to find stable sinusoidal components. A novel pitch-harmonics-based encoding is applied to encode the sinusoidal frequencies. The residual is obtained by extracting the re-synthesized sinusoids from the input, and is processed by a WPT simulating the critical bands of the Human Auditory System. Perceptual Noise Substitution (PNS) is applied in noisy WPT sub-bands to reduce the bit rate. The method provides nearly transparent quality for both speech and audio inputs. The mean bit rate of the compressed signal varies between 32 and 62 kbps depending on the input. Demonstration sound files are available at www-sc.enstbretagne.fr/~fek/eurospeech01.
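The processing chain named in the abstract (sinusoid extraction, residual, wavelet packet split, noise substitution) can be illustrated with a rough sketch. This is not the authors' implementation. It makes several simplifying assumptions: the sinusoid frequency is taken as known instead of being found with the SSM, a uniform Haar wavelet packet tree stands in for the perceptually adapted WPT, and the PNS step simply regenerates noise with the same energy as the band. All function names, the 20 ms frame length, and the test signal are hypothetical.

```python
import math
import random

FS = 16000  # sampling rate given in the abstract (16 kHz)

def estimate_sinusoid(x, freq_hz, fs=FS):
    """Least-squares amplitude and phase of one sinusoid at a KNOWN frequency.
    Assumes the frame spans a whole number of periods."""
    n = len(x)
    w = 2.0 * math.pi * freq_hz / fs
    c = sum(x[k] * math.cos(w * k) for k in range(n)) * 2.0 / n
    s = sum(x[k] * math.sin(w * k) for k in range(n)) * 2.0 / n
    return math.hypot(c, s), math.atan2(-s, c)

def synthesize(amp, phase, freq_hz, n, fs=FS):
    """Re-synthesize the sinusoid from its parameters."""
    w = 2.0 * math.pi * freq_hz / fs
    return [amp * math.cos(w * k + phase) for k in range(n)]

def energy(v):
    return sum(t * t for t in v)

def haar_packet(x, depth):
    """Full Haar wavelet packet tree (a stand-in for the paper's WPT).
    Returns 2**depth sub-bands in natural tree order, not frequency order."""
    bands = [x]
    r = math.sqrt(2.0)
    for _ in range(depth):
        nxt = []
        for b in bands:
            pairs = range(0, len(b) - 1, 2)
            nxt.append([(b[i] + b[i + 1]) / r for i in pairs])  # low-pass half
            nxt.append([(b[i] - b[i + 1]) / r for i in pairs])  # high-pass half
        bands = nxt
    return bands

def pns_decode(band_energy, n, rng):
    """Decoder side of a simplified PNS: only the band energy is transmitted,
    and random noise scaled to that energy replaces the coefficients."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(band_energy / energy(noise))
    return [scale * v for v in noise]

# Hypothetical 20 ms test frame: one 1 kHz sinusoid plus weak noise.
rng = random.Random(0)
N, F0 = 320, 1000.0
x = [0.8 * math.cos(2.0 * math.pi * F0 * k / FS + 0.3) + rng.gauss(0.0, 0.01)
     for k in range(N)]

amp, phase = estimate_sinusoid(x, F0)
resynth = synthesize(amp, phase, F0, N)
res = [a - b for a, b in zip(x, resynth)]   # residual = input - sinusoids
bands = haar_packet(res, 3)                 # 8 sub-bands of 40 coefficients
pns_band = pns_decode(energy(bands[7]), len(bands[7]), rng)

print("estimated amplitude and phase:", amp, phase)
print("residual-to-input energy ratio:", energy(res) / energy(x))
```

Because the Haar split is orthonormal, the sub-band energies sum to the residual energy, which is the quantity a PNS-style coder would transmit per noisy band in place of the coefficients themselves.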

Similar resources

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

Adaptive Signal Models: Theory, Algorithms, and Audio Applications

Adaptive Signal Models: Theory, Algorithms, and Audio Applications, by Michael Mark Goodwin. Doctor of Philosophy in Engineering, Electrical Engineering and Computer Science, University of California, Berkeley. Professor Edward A. Lee, Chair. Mathematical models of natural signals have long been of interest in the scientific community. A primary example is the Fourier model, which was introduced to ex...

Multiresolution sinusoidal modeling using adaptive segmentation

The sinusoidal model has proven useful for representation and modification of speech and audio. One drawback, however, is that a sinusoidal signal model is typically derived using a fixed frame size, which corresponds to a rigid signal segmentation. For nonstationary signals, the resolution limitations that result from this rigidity lead to reconstruction artifacts. It is shown in this paper that ...

Amplitude Modulated Sinusoidal Models for Audio Modeling and Coding

In this paper a new perspective on modeling of transient phenomena in the context of sinusoidal audio modeling and coding is presented. In our approach the task of finding time-varying amplitudes for sinusoidal models is viewed as an AM demodulation problem. A general perfect reconstruction framework for amplitude modulated sinusoids is introduced and model reductions lead to a model for audio co...

Transitional speech segments modeling by matching pursuit with a dictionary based on the psychoacoustic adaptive WP

In this paper, modeling of transitional speech segments by matching pursuit is proposed. The dictionary for matching pursuit is composed of wavelet functions that implement a psychoacoustic adaptive wavelet filter bank. Psychoacoustically motivated, entropy-based cost functions allow the number of time-frequency atoms in the wavelet packet (WP) dictionary to be greatly reduced. The given transient modeli...

Publication date: 2001